Minimizing Cache Misses in Scientific Computing Using Isoperimetric Bodies

Authors

  • Michael A. Frumkin
  • Rob F. Van der Wijngaart
Abstract

A number of known techniques for improving cache performance in scientific computations involve reordering the iteration space. Some of these reorderings can be viewed as coverings of the iteration space with sets having small surface-to-volume ratios. Use of such sets may reduce the number of cache misses in computations of local operators having the iteration space as their domain. First, we derive lower bounds on the cache misses that any algorithm must suffer while computing a local operator on a grid. Then, we explore coverings of iteration spaces of structured and unstructured discretization grid operators which allow us to approach these lower bounds. For structured grids we introduce a covering by successive minima tiles based on the interference lattice of the grid. We show that the covering has a small surface-to-volume ratio and present a computer experiment showing the actual reduction in cache misses achieved by using these tiles. For planar unstructured grids we show the existence of a covering which reduces the number of cache misses to the level of that of structured grids. Next, we introduce a class of multidimensional grids, called starry grids in this paper. These grids represent an abstraction of unstructured grids used in, for example, molecular simulations and the solution of partial differential equations. We show that starry grids can be covered by sets having a low surface-to-volume ratio and, hence, have the same cache efficiency as structured grids. Finally, we present a triangulation of a three-dimensional cube that has the property that any local operator on the corresponding grid must incur a significantly larger number of cache misses than a similar operator on a structured
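The link between surface-to-volume ratio and cache traffic can be illustrated with a minimal sketch. It is not the successive minima tiling of the paper: it assumes plain rectangular tiles and a local operator that reads every point within `radius` of the point being updated, so each tile must load a halo around its interior. The function name `halo_to_volume` and the tile shapes are hypothetical, chosen only for illustration.

```python
from math import prod

def halo_to_volume(tile, radius=1):
    """Ratio of halo points (loaded but not updated) to interior points of a box tile.

    Assumption: the halo is the full box shell of width `radius` around the tile,
    which matches a compact stencil rather than a star-shaped one.
    """
    volume = prod(tile)
    padded = prod(t + 2 * radius for t in tile)
    return (padded - volume) / volume

# Two tiles with the same number of interior points (4096).
cube = halo_to_volume((16, 16, 16))   # compact, near-isoperimetric tile
slab = halo_to_volume((4096, 1, 1))   # elongated tile with a large surface

print(f"cube: {cube:.3f}, slab: {slab:.3f}")
```

Under these assumptions the cube-shaped tile loads far fewer extra points per updated point than the elongated one, which is the intuition behind preferring coverings with small surface-to-volume ratios.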


Similar resources

Adaptive Cache Placement for Scientific Computation

The central data structures for many applications in scientific computing are large multidimensional arrays. These arrays dominate memory accesses and are often accessed with strides that vary across orthogonal dimensions posing a central and critical challenge to develop effective caching strategies. We propose a novel technique to optimize cache placement for multidimensional arrays with the ...

Using Minimum-Surface Bodies for Iteration Space Partitioning

A number of known techniques for improving cache performance in scientific computations involve the reordering of the iteration space. Some of these reorderings can be considered as coverings of the iteration space with the sets having good surface-to-volume ratio. Use of such sets reduces the number of cache misses in computations of local operators having the iteration space as a domain. We s...

Randomized Cache Placement for Eliminating Conflicts

Applications with regular patterns of memory access can experience high levels of cache conflict misses. In shared-memory multiprocessors conflict misses can be increased significantly by the data transpositions required for parallelization. Techniques such as blocking which are introduced within a single thread to improve locality, can result in yet more conflict misses. The tension between mini...

The Combinatorics of Cache Misses during Matrix Multiplication

In this paper we construct an analytic model of cache misses during matrix multiplication. The analysis in this paper applies to square matrices of size 2^m where the array layout function is given in terms of a function that interleaves the bits in the binary expansions of the row and column indices. We first analyze the number of cache misses for direct-mapped caches and then indicate how to e...
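A bit-interleaving layout function of the kind this abstract mentions can be sketched as follows. This is only an illustration, assuming the matrix side is a power of two, 2^m, and that row bits occupy the odd positions of the offset; the name `interleave_bits` and the bit ordering are assumptions, not details taken from the paper.

```python
def interleave_bits(row, col, m):
    """Map (row, col) of a 2^m x 2^m matrix to a linear offset by interleaving bits.

    Assumed convention: bit b of col goes to position 2b, bit b of row to 2b + 1.
    """
    offset = 0
    for b in range(m):
        offset |= ((col >> b) & 1) << (2 * b)
        offset |= ((row >> b) & 1) << (2 * b + 1)
    return offset

# Offsets of a 4 x 4 matrix (m = 2), printed row by row.
m = 2
layout = [[interleave_bits(r, c, m) for c in range(1 << m)] for r in range(1 << m)]
for line in layout:
    print(line)
```

With this convention each aligned 2 x 2 block of the matrix occupies four consecutive offsets, so nearby rows and columns tend to share cache lines.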

Compiler Optimizations for Eliminating Cache Conflict Misses

Limited set-associativity in hardware caches can cause conflict misses when multiple data items map to the same cache locations. Conflict misses have been found to be a significant source of poor cache performance in scientific programs, particularly within loop nests. We present two compiler transformations to eliminate conflict misses: 1) modifying variable base addresses, 2) padding inner ar...
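The effect of padding an inner array dimension can be shown with a toy model. The sketch below is not the compiler transformation described in this abstract: it assumes a direct-mapped cache with hypothetical parameters (512 sets of 64-byte lines), 8-byte elements, and a row-major array whose first element is line-aligned, and it counts how many distinct cache sets a walk down one column touches. The function name `sets_touched` is likewise hypothetical.

```python
def sets_touched(row_len, rows, elem_bytes=8, line_bytes=64, num_sets=512):
    """Count distinct cache sets hit when walking down column 0 of a row-major array.

    Assumptions: direct-mapped cache, array base address aligned to a cache line.
    """
    return len({
        (r * row_len * elem_bytes // line_bytes) % num_sets
        for r in range(rows)
    })

# A row length of 4096 doubles spans exactly 512 lines, so every row start
# maps to the same set; padding each row by 8 elements shifts them apart.
unpadded = sets_touched(row_len=4096, rows=64)
padded = sets_touched(row_len=4096 + 8, rows=64)

print(unpadded, padded)
```

In this model the unpadded column walk keeps evicting the same set, while the padded layout spreads the 64 accesses over 64 different sets.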


Journal title:
  • CoRR

Volume cs.PF/0205062  Issue 

Pages  -

Publication date 2002